The HyperBagGraph DataEdron: An Enriched Browsing Experience of Multimedia Datasets
Traditional verbatim browsers return information linearly, following a
ranking performed by a search engine that may not be optimal for the
surfer. The latter may need to assess the pertinence of the information
retrieved, particularly when she wants to explore other facets of a
multi-faceted information space. For instance, in a multimedia dataset,
different facets such as keywords, authors, publication category, organisations
and figures can be of interest. Visualising the facets simultaneously can help
to gain insights into the information retrieved and prompt further searches.
Facets are co-occurrence networks, modeled by HyperBag-Graphs (families of
multisets), and are in fact linked not only to the publication itself, but to
any chosen reference. These references make it possible to navigate inside the
dataset and perform visual queries. We explore here the case of scientific
publications based on arXiv searches.
Comment: Extension of the hypergraph framework briefly presented in
arXiv:1809.00164 (possible small overlaps); uses the theoretical framework of
hb-graphs presented in arXiv:1809.0019
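As a rough illustration of how a facet can be built with respect to a chosen reference, the sketch below groups facet values into one multiset per reference value. The record layout, the field names and the helper `build_facet` are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical records: each publication carries several facets.
publications = [
    {"id": "p1", "authors": ["A", "B"],
     "keywords": ["hypergraph", "diffusion", "hypergraph"]},
    {"id": "p2", "authors": ["B"], "keywords": ["diffusion"]},
]

def as_list(value):
    """Wrap a single value so that every field can be iterated uniformly."""
    return value if isinstance(value, list) else [value]

def build_facet(records, facet, reference):
    """Return one multiset of facet values per value of the reference field."""
    hb_edges = {}
    for record in records:
        for ref in as_list(record[reference]):
            hb_edges.setdefault(ref, Counter()).update(as_list(record[facet]))
    return hb_edges

# The same facet (keywords) seen through two different references.
keywords_by_publication = build_facet(publications, "keywords", "id")
keywords_by_author = build_facet(publications, "keywords", "authors")
```

Changing the `reference` argument regroups the same keywords around authors instead of publications, which is the kind of switch the abstract describes as navigating through references.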
Hyper-bag-graphs and their applications: Modeling, Analyzing and Visualizing Complex Networks of Co-occurrences
Gaining insight into the tremendous amount of data that the Big Data era has brought us requires developing specific tools that are not only summaries of data through classical charts and tables, but that allow full navigation and browsing of a dataset. Proper modeling of databases can enable such navigation, and we propose in this thesis a methodology for browsing an information space through its different facets. To model such an information space, co-occurrences of data instances are built with reference to a common reference type. Historically, co-occurrences were seen as pairwise relationships and developed as such. Moving to hypergraphs makes it possible to take into account the multi-adicity of the relationships, and to obtain a representation through the incidence graph that greatly simplifies the 2-section. Nonetheless, representing large hypergraphs calls for a coarsening of the information by gaining insight into important vertices and hyperedges. One classical way to achieve this is to use a diffusion process over the network. Doing so with an incidence matrix is feasible but leads to a pitfall, as it brings us back to a pairwise relationship. Proper diffusion requires a tensor approach. This is well known for uniform hypergraphs, where all the hyperedges have the same cardinality, but remains very challenging for general hypergraphs. After redefining the concept of adjacency in general hypergraphs, we propose a first e-adjacency tensor that involves a Hypergraph Uniformisation Process and a Polynomial Homogenization Process. The original hypergraph is uniformised by decomposing it into layers, each containing a uniform hypergraph, filling each layer with additional special vertices, and merging the layers together. This process requires as many additional vertices as there are layers.
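The layer-and-fill idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual Hypergraph Uniformisation Process. It assumes that a hyperedge of cardinality k receives one distinct special vertex for each missing slot up to the maximal cardinality; the function `uniformise` and the labels `("y", j)` are hypothetical.

```python
from collections import defaultdict

def uniformise(hyperedges):
    """Fill every hyperedge up to the maximal cardinality with special vertices."""
    # Decompose into layers: layer k holds the hyperedges of cardinality k.
    layers = defaultdict(list)
    for edge in hyperedges:
        layers[len(edge)].append(frozenset(edge))
    k_max = max(layers)

    uniform = []
    for k, edges in sorted(layers.items()):
        # Hyperedges are sets, so each missing slot needs a distinct special
        # vertex: ("y", k), ("y", k + 1), ..., ("y", k_max - 1).
        fillers = {("y", j) for j in range(k, k_max)}
        uniform.extend(edge | fillers for edge in edges)
    return uniform, k_max

edges = [{"a"}, {"a", "b"}, {"b", "c", "d"}]
uniform_edges, k_max = uniformise(edges)
```

Because a set cannot contain the same element twice, the sketch needs a new special vertex for every layer it fills.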
In order to reduce the number of special vertices, we need to be able to repeat a vertex when filling, which is not possible with hyperedges as they are sets. We need multisets. This therefore requires a new mathematical structure, which we have introduced and called hyper-bag-graph (hb-graph for short): a family of multisets over a given universe. Co-occurrences can also have repetitions or individual weightings of their vertices inside a given co-occurrence, and hb-graphs are suited to handle this. Hence, we introduce a hb-graph framework for co-occurrence networks. We then work on diffusion on such structures, using, as a first step, a matrix approach. The rankings of vertices and hb-edges produced by this diffusion on each facet of the information space are aggregated using a multi-diffusion scheme. Since different facets might have different focuses of interest, we introduce a biased diffusion that allows the point of emphasis to be tuned to the feature we are interested in. Finally, coming back to the e-adjacency tensor, we propose three e-adjacency tensors of hb-graphs, based on different ways of filling the hb-edges. The m-uniformisation that is achieved is evaluated and compared to the ones achieved by hb-edge splitting, concluding that any m-uniformisation process has an influence on the exchange-based diffusion that we propose. Hence, we conclude that diffusion using the tensor approach must be done in an informed manner to account for this diffusion change. We finally discuss different possible ways of achieving this and present a new Laplacian that can help to achieve it.
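To make the role of multisets concrete, the sketch below represents hb-edges as `collections.Counter` objects and pads each one with repetitions of a single special vertex until all hb-edges have the same m-cardinality. It shows only one conceivable filling strategy and is not claimed to be any of the three proposed in the thesis; the names `m_cardinality`, `fill_with_single_vertex` and the vertex `"y"` are assumptions made for illustration.

```python
from collections import Counter

# An hb-graph as a family of multisets: vertex -> multiplicity in each hb-edge.
hb_edges = {
    "e1": Counter({"a": 2, "b": 1}),
    "e2": Counter({"b": 1}),
    "e3": Counter({"a": 1, "c": 3}),
}

def m_cardinality(hb_edge):
    """Number of elements of the multiset, counted with multiplicity."""
    return sum(hb_edge.values())

def fill_with_single_vertex(hb_edges, special="y"):
    """Pad every hb-edge with repetitions of one special vertex."""
    m_max = max(m_cardinality(e) for e in hb_edges.values())
    filled = {}
    for name, edge in hb_edges.items():
        padded = Counter(edge)
        missing = m_max - m_cardinality(edge)
        if missing:
            # Repetition is allowed in a multiset, so one vertex is enough.
            padded[special] += missing
        filled[name] = padded
    return filled

filled = fill_with_single_vertex(hb_edges)
```

Here a single special vertex suffices regardless of how many distinct m-cardinalities occur, which is the reduction that motivates moving from sets to multisets.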
Guide thérapeutique de chimiothérapie anticancéreuse pour les carnivores domestiques [Therapeutic guide to anticancer chemotherapy for domestic carnivores]
LYON1-BU Santé (693882101) / Sudoc, France
Exchange-based diffusion in Hb-Graphs: Highlighting complex relationships in multimedia collections
Highlighting important information in a network is commonly achieved by using random
walks related to diffusion over such structures. Complex networks, where entities can have
multiple relationships, call for a modeling based on hypergraphs. However, the limitation of
hypergraphs to binary entities in co-occurrences has led us to introduce a new mathematical
structure called hyper-bag-graphs, which relies on multisets. This is not only a shift in
designation but a real change of mathematical structure, with a new underlying algebra.
Diffusion processes commonly start with a stroke at one vertex and diffuse over the network.
In the original conference article (Ouvrard et al. 2018) that this article extends, we
proposed a two-phase step exchange-based diffusion scheme, in the continuum of spectral
network analysis approaches, that takes into account the multiplicities of entities. This
diffusion scheme makes it possible to highlight information not only at the level of the
vertices but also at the regrouping level. In this paper, we present new contributions: the
proofs of conservation and convergence of the extracted sequences of the diffusion process,
as well as an illustration of the speed of convergence and a comparison between classical
and modified random walks; the algorithms of the exchange-based diffusion and the modified
random walk; and the application to two use cases, one based on arXiv publications and
another based on COCO dataset images. All the figures have been revisited in this extended
version to take the new developments into account.
Exchange-Based Diffusion in Hb-Graphs: Highlighting Complex Relationships
Most networks tend to show complex and multiple relationships between entities. Networks are usually modeled by graphs or hypergraphs; nonetheless, a given entity can occur many times in a relationship: this brings the need to deal with multisets instead of sets or simple edges. Diffusion processes are useful to highlight interesting parts of a network: they usually start with a stroke at one vertex and diffuse throughout the network to reach a uniform distribution. Several iterations of the process are required before reaching a stable solution. We propose an alternative solution to highlighting the main components of a network using a diffusion process based on exchanges: it is an iterative two-phase step exchange process. This process makes it possible to evaluate importance not only at the level of the vertices but also at the regrouping level. To model the diffusion process, we extend the concept of hypergraphs, which are families of sets, to families of multisets, which we call hb-graphs.
This version is an extended version of arXiv:1809.00190v1: the overlaps with v1 are in black, the new content is in blue. The contributions of this extended version are: the proofs of conservation and convergence of the extracted sequences of the diffusion process, as well as an illustration of the speed of convergence and a comparison to classical and modified random walks; the algorithms of the exchange-based diffusion and the modified random walk; and the application to a use case based on arXiv publications. All the figures except one have been either modified or added in this extended version to take into account the new developments.
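One way to picture a two-phase exchange step is sketched below. The abstract does not give the exchange rules, so the ones used here are assumptions: each vertex shares its value among its hb-edges in proportion to its multiplicity in them, then each hb-edge returns its value to its vertices in proportion to their multiplicity. hb-edge weights are omitted, every vertex is assumed to belong to at least one hb-edge, and the function `exchange_step` is hypothetical.

```python
def exchange_step(hb_edges, alpha):
    """One two-phase exchange: vertices -> hb-edges, then hb-edges -> vertices."""
    # m-degree of a vertex: sum of its multiplicities over all hb-edges.
    degree = {v: 0 for v in alpha}
    for edge in hb_edges.values():
        for v, m in edge.items():
            degree[v] += m

    # Phase 1: each vertex hands its value to its hb-edges,
    # split in proportion to its multiplicity in each hb-edge.
    epsilon = {
        name: sum(alpha[v] * m / degree[v] for v, m in edge.items())
        for name, edge in hb_edges.items()
    }

    # Phase 2: each hb-edge hands its value back to its vertices,
    # split in proportion to their multiplicity in that hb-edge.
    new_alpha = {v: 0.0 for v in alpha}
    for name, edge in hb_edges.items():
        size = sum(edge.values())
        for v, m in edge.items():
            new_alpha[v] += epsilon[name] * m / size
    return new_alpha, epsilon

# hb-edges as vertex -> multiplicity mappings; the stroke starts at vertex "a".
hb_edges = {"e1": {"a": 2, "b": 1}, "e2": {"b": 1, "c": 1}}
alpha0 = {"a": 1.0, "b": 0.0, "c": 0.0}
alpha1, epsilon1 = exchange_step(hb_edges, alpha0)
```

Under these assumed rules the total value is unchanged after each phase, which is the kind of conservation property the extended version sets out to prove for its own scheme. The intermediate `epsilon` values give a score at the hb-edge (regrouping) level, alongside the vertex scores in `alpha1`.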